In Recent Advances in Parsing Technology
Authors
Abstract
This chapter describes GLR*, a parser that can parse any input sentence by ignoring unrecognizable parts of the sentence. Using an efficient algorithm, the parser is capable of finding and parsing a maximal subset of the original input that is parsable, and therefore returns the parse with the fewest skipped words. The parser returns some parse(s) for any input sentence, unless no part of the sentence can be recognized at all. Formally, the problem can be defined in the following way: given a context-free grammar G and a sentence S, find and parse S′, the largest subset of words of S such that S′ ∈ L(G). The algorithm described in this chapter is a modification of the Generalized LR (Tomita) parsing algorithm (Tomita, 1986). The parser accommodates the skipping of words by allowing shift operations to be performed from inactive state nodes of the Graph Structured Stack. A heuristic similar to beam search makes the algorithm computationally tractable. The modified parser, GLR*, has been implemented and integrated with the latest version of the Generalized LR Parser/Compiler (Tomita et al., 1988; Tomita, 1990). We discuss an application of the GLR* parser to spontaneous speech understanding and present some preliminary tests on the utility of the GLR* parser in such settings.
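The formal problem statement above can be made concrete with a small brute-force sketch. This is not GLR* and not the authors' algorithm: it enumerates subsequences of the input from longest to shortest and tests each with a plain CYK recognizer, which takes exponential time and only illustrates what "the largest parsable subset" means. The toy grammar, the lexicon, and the function names `recognizes` and `largest_parsable_subset` are hypothetical and chosen purely for illustration.

```python
from itertools import combinations

# Hypothetical toy grammar in Chomsky normal form (not taken from the chapter).
BINARY_RULES = {                 # (B, C) -> set of A such that A -> B C
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
LEXICON = {                      # word -> set of A such that A -> word
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "sees": {"V"},
}


def recognizes(words, start="S"):
    """CYK recognizer: True if `words` can be derived from `start`."""
    n = len(words)
    if n == 0:
        return False
    # table[i][k] holds the nonterminals that derive words[i : i + k + 1].
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][0] = set(LEXICON.get(word, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            for split in range(1, span):
                left = table[i][split - 1]
                right = table[i + split][span - split - 1]
                for b in left:
                    for c in right:
                        table[i][span - 1] |= BINARY_RULES.get((b, c), set())
    return start in table[0][n - 1]


def largest_parsable_subset(words):
    """Brute force: return (kept_words, skipped_count) for a longest
    subsequence of `words` that is in L(G), or None if no part parses."""
    for keep in range(len(words), 0, -1):
        for indices in combinations(range(len(words)), keep):
            candidate = [words[i] for i in indices]
            if recognizes(candidate):
                return candidate, len(words) - keep
    return None


# Assumed example input with two unrecognizable filler words.
noisy = "the dog um sees uh the cat".split()
result = largest_parsable_subset(noisy)
print(result)
```

With this toy grammar, the fillers "um" and "uh" have no lexical category, so the sketch skips exactly those two words. GLR* reaches the same kind of answer without enumerating subsets, by extending the Generalized LR parser as the abstract describes.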